Abstract

We explore experimental procedures for comparing the capabilities of complex discrete event service systems. Instead of measuring system capability by analyzing or simulating the system with a constant rate of arriving work, system capability is measured as the maximum rate of work arrival for which the system has a steady state. Hence, we seek the arrival rate which causes the system to be at full capacity. This rate is arguably the best indication of the service system's capability. We treat both work-conserving and non-work-conserving service systems.

Introduction 

As industrial engineers, applied probabilists, simulationists, and systems analysts, we are often called upon to evaluate systems which service input traffic and produce finished products. These systems are sometimes traditional queues or networks of queues, but are often systems with queue-like characteristics which cannot accurately be modeled as traditional queueing systems. In practice and in the literature, this evaluation is traditionally based on exercising a model of the service system by subjecting it to a stream of input traffic and estimating or calculating some expected system performance measure.

We feel that this typical experimental design is lacking, and that the shortcomings stem from the arbitrary choice of the distribution of the input process. Especially problematic are cases where the service system being modeled does not currently exist, where worst-case behavior is sought, or where we wish to evaluate the system in situations which are not accessible for data collection. Practical service system analysis is interesting only when the service system is in an environment where the workload is high relative to the system's capability to serve. In all that follows, we are interested in finding the intensity of the input process that taxes the service system to the extremes of its capabilities, and using this intensity as a measure to compare systems.

1  The  General  Service  System  Model 

The  service  systems  considered  all  have  the  following  features: 

1. a centralized, controllable, nonlattice process which generates tasks at a rate $\lambda$ per unit time;

2. tasks are admitted upon generation and processed by the system;

3. a completed task is ejected from the system;

4. the system has the capability to process as many as $\mu$ tasks per unit time.

We will call such systems Discrete Event Service Systems (DESSs), see figure 1. We will allow the system to create, combine, destroy, or absorb tasks; the DESS need not be work conserving. A DESS must, however, be mixed or open, as we are interested in how much work we can inject into the system without overburdening it. We may measure the performance of the system using traditional queueing measures such as the number of tasks resident in the system or some production cost, or we may opt to analyze some measures which are particular to the application at hand. In this work, we will approach the work-conserving and non-work-conserving systems separately.
Figure 1: A simple DESS. An input process with intensity $\lambda(t)$ enters the system, and the output process has maximum intensity $\mu$.

1.1 Motivating Example

Recently a model was constructed of all of the single-channel radio communications in a Marine Expeditionary Brigade (MEB). A MEB consists of approximately 20,000 marines who perform in three subelements, the Ground Combat Element (GCE), the Air Combat Element (ACE), and the Combat Service Support Element (CSSE). The single-channel radio network of the MEB may consist of several thousand radios operating on several hundred radio nets. The goal of the study was to allocate newly purchased antijamming radios to the different units in the MEB.

The radio traffic was modeled using the Marine Corps' version of a structured traffic model. There are sixty classes of communication task packages, each representing a different mission that the MEB executes in battle. Each task package, called a Broad Operational Subtask (BOST), was composed of several tasks which were called Message Exchange Occurrences (MEOs). Each BOST contained between five (5) and sixteen (16) MEOs, and each MEO has a set of other MEOs in the BOST which must be completed before it can be initiated.

The MEOs are completed by transmitting a message from the specified sender to each of the [...] net, the sender will attempt to reach the receiver using other nets, relays through other radios, or using messengers. The user may also elect to delay the MEO some short time before attempting it again. The radios in the system

•  experience  failures; 

•  become  jammed  by  scenario-specified  jammers; 

•  hold  queues  of  MEOs  with  different  priorities; 

•  are  capable  of  moving  into  and  out  of  range  of  other  radios; 

•  can  be  used  in  voice  or  digital  mode; 

• exit and join different radio nets as they move around or attempt to reroute MEOs.

The antijamming radio fails less frequently and is harder to jam than the old radio. However, it is much more time consuming to enter a net with the new radio than with the old one. Finally, if an old radio tries to contact a new radio, the new radio must transit into a less capable mode in order to receive the MEO, then reenter its regular net of new radios.

Our model features structured traffic being presented to a set of servicing capabilities which act semi-autonomously. The routing of the MEOs, sometimes using several transmissions to accomplish a single MEO, makes this model extremely difficult to analyze. To expedite the analysis and to achieve maximum flexibility and sponsor acceptance, an object-oriented simulation model was constructed to allow analysts to build MEB radio net structures and test their capabilities against one another.

This brings us to the search for the right rate of generation of BOSTs in the system. This generation rate is clearly dependent on the pace and nature of the battle being experienced by the MEB. After some initial searching through volumes of data from the recent Desert Storm operation, we found that the Marine Corps didn't take time during their ground war to record the time and type of every BOST they executed! Experience with the model showed that the measured performance depended greatly on the pace of the presented traffic. When considering various C3I architectures, we found the ranking of the radio allocations from best to worst changed as the BOST generation rate changed. The sponsor wanted to identify the best architecture for the most intense traffic.

2  Work  Conserving  Systems 

We first study the behavior of systems where there is a one-to-one correspondence between the tasks we submit for processing and the finished tasks the service system produces. Work-conserving queueing models do not allow

•  tasks  to  expire  while  in  service; 

• tasks to create other tasks while in service;

• tasks to be split or combined;

• tasks which never finish service.

Work-conserving queueing system models are common in both the practice and literature of applied probability. In a typical experiment, we generate input to the system at a constant rate, monitor the performance of the system either at fixed intervals or upon departure from the system, and employ well-known methods of steady-state analysis to estimate the steady-state average of the performance measure.

A maxim of the analysis of service systems is that the system will have stationary long-run behavior if and only if the number of arriving tasks is, on average, less than the number of tasks the system is capable of processing. If our overall system can work at a maximum of $\mu$ tasks per unit time, we can input as many as $\mu$ per unit time and the system will remain stationary. If $\lambda$ is our arrival rate for the system, we wish to manipulate $\lambda$ to expose $\mu$.

2.1  Generating  Data 

There are two ways we can generate data from a work-conserving system which will reveal the maximum processing rate in the system. They are:

• input tasks to the system at a rate known to be much higher than the system can handle;

•  fill  the  system,  then  input  a  new  task  every  time  that  a  task  completes. 

Instead of choosing a very high input rate and dealing with the problems of exploding buffer contents and a nonrecurrent system, we will simply close off the system and recirculate the tasks which finish. This approach will also serve as a good introduction to the analysis of systems which do not conserve work.

Thus, we examine a special kind of closed queueing network: one with a single link which all tasks traverse. Let $\lambda(t)$ be the time-dependent rate of recirculation of tasks in the system. So long as the system contains enough tasks to keep it working at capacity, we have $\lambda(t) \to \mu$ as $t \to \infty$. Kelly [3] and Walrand [9] both show this for exponentially distributed service, and Disney and Kiessler [2] make the extension to Jackson networks. The result can be extended in the obvious way by treating phase-type distributions for service times, to produce the result we seek ($\lambda(t) \to \mu$ as $t \to \infty$) for generally distributed service.
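The recirculation idea can be illustrated with a minimal simulation sketch. It is not the model used in this work: it assumes a single hypothetical workstation with two servers and exponentially distributed service times of mean 1, and the names `closed_loop_rate`, `num_tasks`, `num_servers` and `num_completions` are ours. Each finished task is immediately re-inserted, so the observed completion rate should approach the station capacity.

```python
import heapq
import random


def closed_loop_rate(num_tasks, num_servers, num_completions, seed=0):
    """Estimate throughput of one station fed by its own completed tasks."""
    rng = random.Random(seed)
    waiting = num_tasks          # tasks queued for a server
    in_service = []              # min-heap of service completion times
    now = 0.0
    done = 0
    while done < num_completions:
        # Start service on waiting tasks while a server is free.
        while waiting > 0 and len(in_service) < num_servers:
            heapq.heappush(in_service, now + rng.expovariate(1.0))
            waiting -= 1
        # Advance to the next completion and recirculate that task.
        now = heapq.heappop(in_service)
        done += 1
        waiting += 1
    return done / now


rate = closed_loop_rate(num_tasks=25, num_servers=2, num_completions=20000)
print(f"estimated capacity: {rate:.3f} tasks per unit time")
```

Because the system is closed, the buffer contents cannot explode, which is the practical advantage over driving an open system with a very high input rate.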

Example 

We will demonstrate this method on the Jackson network shown in figure 2.

2.2  Detecting  Transition  to  Steady  State 

Let us simulate the completion of the first $N$ customers serviced by the closed system for $M$ independent replications. Let $T_{i,j}$ be the $j$th time between recirculations during the $i$th replication. Thus, $T_{i,j}$, $i = 1, 2, \ldots, M$ is a set of iid samples. Let $\bar{T}_j = \sum_{i=1}^{M} T_{i,j}/M$ be the average recirculation time process. We seek the index $N^*$ such that $E T_{i,j} = E \bar{T}_j = 1/\mu$ for all $j > N^*$. Hence, we are in the setting of a traditional initial transient detection problem.
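As a small illustration of the quantities just defined, the sketch below computes the average recirculation time process and a cross-replication confidence half-width at each index. The function name `cross_replication_ci` and the toy data are hypothetical, and a normal quantile is used for simplicity; with few replications a Student t quantile would be more appropriate.

```python
from statistics import NormalDist, mean, stdev


def cross_replication_ci(samples, level=0.95):
    """Return (mean, half_width) per index j from M equal-length replications."""
    m = len(samples)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # normal quantile, an approximation
    result = []
    for column in zip(*samples):                  # column j holds T_{1,j}, ..., T_{M,j}
        result.append((mean(column), z * stdev(column) / m ** 0.5))
    return result


# Toy data: M = 3 replications of N = 3 recirculation times each.
reps = [[0.9, 0.6, 0.5], [1.1, 0.4, 0.5], [1.0, 0.5, 0.5]]
for j, (avg, half) in enumerate(cross_replication_ci(reps), start=1):
    print(j, round(avg, 3), round(half, 3))
```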

There exist many ways to tackle this problem, including

• cross-replication confidence intervals [10];

• tests for significant drift;

• standardized time series (STS) [6].

In our work, a particularly useful method is a version of STS we have developed, which we call ratio STS (RSTS).

Figure 2: A Jackson network designed so that, at its maximum service rate, the mean time between task completions is 0.5. The numbers in parentheses are the number of servers at each station, and the routing probabilities are shown on the workstation connections. All servers have unit service time, and all buffers are infinite. The dashed line shows the recirculation route added to force the system to serve at the maximum rate.


2.3  Ratio  STS 


The method of standardized time series (STS) [6] produces confidence intervals from autocorrelated, stationary data. This method was used in [7] to detect the existence of initialization bias in simulation output, and was sharpened to produce optimal tests when the functional form of the initialization bias is known.

Suppose that we have $M$ independent samples of $n$ points each, with $Y_{i,j}$ being the $j$th point in the $i$th independent sample. Let

$$\bar{Y}_{i,k} = \frac{1}{k} \sum_{j=1}^{k} Y_{i,j}$$

for $i = 1, 2, \ldots, M$, and with $\bar{Y}_{i,0} = 0$ for each $i$. The time series $S_i(k)$, $k = 1, 2, \ldots, n$ is constructed for each independent replication $i$ as

$$S_i(k) = \begin{cases} \bar{Y}_{i,n} - \bar{Y}_{i,k} & \text{for } 0 < k < n, \\ 0 & \text{for } k = 0, n. \end{cases}$$

Let $\sigma^2$ be the variance of $Y$. If $S_i(k)$ is divided by $\sigma\sqrt{n}/k$ and we scale the index $k$ so that the result resides in the unit interval $[0, 1]$, the resulting time series $T_i(t)$, $0 \le t \le 1$, is known to approximate a Brownian bridge as $n \to \infty$. This is the fundamental result of [6], and the theoretical basis of this sequential procedure.

Schruben shows that scaling and summing $T_i(t)$,

$$A_i = \sigma \sqrt{n} \sum_{k=1}^{n} T_i(k/n),$$

results in a normal random variable $A_i$ with variance given by

$$\mathrm{VAR}(A_i) = \frac{\sigma^2 n (n^2 - 1)}{12}.$$
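As a check on this variance, here is a short derivation for $A_i = \sigma\sqrt{n}\sum_{k=1}^{n} T_i(k/n)$. It assumes that $T_i$ can be replaced by its limiting Brownian bridge $B$, whose covariance is $\mathrm{Cov}(B(s), B(t)) = \min(s,t) - st$; treating the limit as exact at finite $n$ is an approximation on our part.

```latex
\begin{aligned}
\mathrm{VAR}\Big(\sum_{k=1}^{n} B(k/n)\Big)
  &= \sum_{j=1}^{n}\sum_{k=1}^{n}\Big(\frac{\min(j,k)}{n} - \frac{jk}{n^2}\Big) \\
  &= \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}
   = \frac{n^2 - 1}{12}, \\
\mathrm{VAR}(A_i) &= \sigma^2 n \cdot \frac{n^2 - 1}{12}.
\end{aligned}
```

The second line uses $\sum_{j,k} \min(j,k) = n(n+1)(2n+1)/6$ and $\sum_{j,k} jk = n^2(n+1)^2/4$.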
Note that, except for a factor of $\sigma^2$, $\mathrm{VAR}(A_i)$ is independent of the data; it relies only on the parameters of the experiment. Hence, for any integer $d < M$,

$$\frac{12 \sum_{i=1}^{d} A_i^2}{n(n^2 - 1)} \sim \sigma^2 \chi^2_d.$$

The original STS method used to detect initial transients used a statistic of this form as a test statistic for stationarity of the mean response. If we form a ratio of two such statistics computed from disjoint sets of replications, we can eliminate the need to estimate $\sigma^2$, forming

$$F_{d,M-d} = \frac{d^{-1} \sum_{i=1}^{d} A_i^2}{(M-d)^{-1} \sum_{i=d+1}^{M} A_i^2}.$$
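A minimal sketch of the ratio computation follows. It relies on the observation that $\sigma\sqrt{n}\,T_i(k/n) = k\,S_i(k)$, so that $A_i = \sum_k k\,S_i(k)$ can be computed without knowing $\sigma$. The function names `sts_area` and `rsts_ratio` and the toy replications are ours, and the comparison against an F critical value is left to the reader's statistical library of choice.

```python
def sts_area(y):
    """Return A = sum_k k * (ybar_n - ybar_k) for one replication."""
    n = len(y)
    grand_mean = sum(y) / n
    running, area = 0.0, 0.0
    for k, value in enumerate(y, start=1):
        running += value                      # running / k is the cumulative mean ybar_k
        area += k * (grand_mean - running / k)
    return area


def rsts_ratio(replications, d):
    """Ratio of mean squared areas: first d replications over the remaining ones."""
    areas_sq = [sts_area(y) ** 2 for y in replications]
    top = sum(areas_sq[:d]) / d
    bottom = sum(areas_sq[d:]) / (len(areas_sq) - d)
    return top / bottom


toy = [[1, 2, 3], [3, 1, 2], [2, 3, 1], [1, 3, 2]]
print(rsts_ratio(toy, d=2))
```

Rescaling every observation by the same constant leaves the ratio unchanged, which is the sense in which the unknown $\sigma^2$ drops out.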

This test statistic, which we call the RSTS test statistic, is easy to use in all of the applications where STS is applied. In particular, if we are interested in determining the onset of steady state, we can form the backward-moving sequences $A_{i,j}$, $j = n-1, n-2, \ldots, 1$ for each replication $i$, where $A_{i,j}$ is formed from the subsequence $Y_{i,k}$, $k = j, j+1, \ldots, n$, the portion of the $i$th replication between $j$ and $n$. Thus, we form the sequence of F-statistics

$$F_{d,M-d}(j) = \frac{d^{-1} \sum_{i=1}^{d} A_{i,j}^2}{(M-d)^{-1} \sum_{i=d+1}^{M} A_{i,n}^2}.$$

If we assume that the system is in steady state when each of the $A_{i,n}$ are collected, then we can detect the transition of the system into steady state by looking at the first index $N^*$ where $F_{d,M-d}(N^*)$ exceeds the critical value for an F random variable with identical degrees of freedom. This method is demonstrated in the following example.


2.3.1  Example 


Continuing with the work-conserving system example, suppose that we

• start the system with 25 tasks enqueued at workstation 1 at time 0.0;

• simulate $N = 500$ customer recirculations;

• replicate $M = 20$ times.

Figure 3: Trajectory of the mean process $\bar{T}_j$ and the confidence intervals.

Figure 3 shows the trajectory of $\bar{T}_j$ and the associated confidence interval process for the first 140 recirculations. Clearly, by sample $N^* = 80$ we have passed the criterion for being in steady state according to Welch's cross-replication confidence interval method. Furthermore, we can see that any detectable slope in the mean process is negligible. When tested for our 20 independent samples, the drift tested to be insignificant ($H_0$: no drift has p-value $\approx 0.4$).

The mean time between recirculations in 100, 101, ..., 140 was $\bar{T} = 0.56$ (confidence interval (0.55330, 0.56001)), clearly not as fast as the $1/\mu = 0.50$ implied by the system's known capacity. Performing unweighted RSTS in the first 140 samples showed no detectable transition to steady state; the procedure seemed to be accurate enough to discern that the transition had not yet occurred, see figure 4.
Figure 5: RSTS performed on the first 500 sample recirculation times.

When we extend the length of the runs we consider to the full 500 samples, we see that RSTS was able to indicate a strong transition to steady state around the $N^* = 350$ sample, see figure 5. Averaging the samples collected in 350, 351, ..., 500, we observe an overall average of $\bar{T} = 0.501$ (confidence interval (0.49816, 0.50389)).

RSTS clearly dominated the other traditional initial transient methods. In the case of the recirculating jobs, we clearly have a very gradual descent to the steady-state average. The detection method is not important to our overall theme, though we must issue a general caution: the choice of $N^*$ should be made very conservatively.
3 Non-Work-Conserving Systems

In this section, we consider the case where work is not conserved in the system, and where the measure of performance is very general. This case, which is much more interesting and applicable than the work-conserving case, allows us to treat cases where the service system may share unusual characteristics with the real system, and where the measure of performance of the system may be dictated by the study sponsor.

3.1  Motivating  Example,  Revisited 

In the motivating example, BOSTs were input to the system. Depending on the BOST type, the BOST might

• splinter into several communications tasks, which may splinter further at a later time;

• require partial or full reassembly at different points;

• expire after it has reached a certain age.

Thus,  the  system  clearly  doesn’t  conserve  work. 

The  sponsor  was  interested  in  specifying 

• a mix of different BOST types which the input was comprised of;

• a time allotment for each type of BOST, and a time when the BOST expires and is removed from the system;

• a one-time cost, by BOST type, assessed when the time allotment expires with the BOST still in process.

All of these requirements were imposed because of the need to assess the system's ability to handle communications as diverse as artillery targeting and mission execution traffic, medical evacuation requests, situation reports, intelligence traffic, logistics and administrative communications, and communications allowing the radio nets to counter radio jamming.
For this application, we constructed a penalty process $p(t)$, which superimposed lateness and expiration penalties from all of the traffic in the system, and accumulated these penalties in $[0, t]$. We compared C3I architectures based on the rate-of-climb of this penalty when traffic was inserted into the system. The sponsor really wanted to know the answer to the following question: "Which architecture has the best peak-load performance?" Since the model was an idealization of the real system, and since wartime traffic load data is not available, we were faced with the problem of determining what traffic rate represented peak loading to the system.


3.1.1 The Workload Ramp


Given that we do not know the service rate of the system, we can try to produce an estimate by modulating the intensity of the system's input and observing the effect this has on the system performance. One might attempt this by stepping through some reasonable intensity values, or by doing some sort of iterative search. After considering several alternatives, we decided that a nonhomogeneous input process with gradually increasing intensity might be appropriate. This idea is similar to testing a stereo system for its ability to play loud music: we gradually turn up the volume, listening for the point where the music begins to become distorted.

The mechanics of generating the ramping workload process and calculating the likelihood ratio for a generated sample path are now presented. The requirement is to generate a nonhomogeneous Poisson process with jump points $x_1, x_2, \ldots, x_N$ from an intensity function $\lambda(t)$ given by

$$\lambda(t) = \begin{cases} \lambda(0) + rt & t \ge 0, \\ 0 & t < 0, \end{cases} \tag{9}$$

where $\lambda(0) > 0$ is the initial intensity and $r$ is the rate of climb of the intensity function. In this presentation $\mathrm{sign}(r)$ is not specified, and is a point of interest in future research. Let $N(t)$ be the number of tasks arriving during $[0, t]$. From the above equation, we have

$$a(t) = E[N(t)] \tag{10}$$

$$= \int_0^t \lambda(s)\, ds \tag{11}$$

$$= \lambda(0)\, t + \frac{r}{2} t^2. \tag{12}$$

Thus, if $x_1, x_2, \ldots, x_N$ is generated via (9), then $a(x_1), a(x_2), \ldots, a(x_N)$ is a Poisson process with rate 1 [1]. The scheme for generating our ramping workload process is given as

    T_0 = 0.0, j = 1
    while NOT DONE
        generate U ~ U[0, 1]
        T_j = T_{j-1} - ln(U)
        x_j = a^{-1}(T_j) = (-lambda(0) + sqrt(lambda(0)^2 + 2 r T_j)) / r
        j = j + 1
    end while

Algorithm 1: The generation of the ramping intensity workload process.
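Algorithm 1 can be sketched in Python as follows. This is an illustration under our own assumptions rather than the code used in the study: the function name `ramp_arrivals`, the stopping rule (a fixed time horizon), and the restriction to $r > 0$ are ours.

```python
import math
import random


def ramp_arrivals(lam0, r, horizon, rng):
    """Arrival epochs on [0, horizon] for intensity lam0 + r*t, with r > 0."""
    arrivals = []
    unit_time = 0.0                       # epoch T_j of the rate-1 Poisson process
    while True:
        unit_time -= math.log(1.0 - rng.random())   # add an Exp(1) gap
        # Invert a(x) = lam0*x + r*x^2/2 to map T_j back to real time x_j.
        x = (-lam0 + math.sqrt(lam0 * lam0 + 2.0 * r * unit_time)) / r
        if x > horizon:
            return arrivals
        arrivals.append(x)


times = ramp_arrivals(lam0=1.0, r=1.0 / 150.0, horizon=300.0, rng=random.Random(1))
print(len(times), "arrivals; expected about", 1.0 * 300.0 + 300.0 ** 2 / 300.0)
```

Using `1.0 - rng.random()` keeps the argument of the logarithm strictly positive, since `random()` can return 0.0 but never 1.0.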


Let $\bar{G}(x) = 1 - G(x)$ be the complement of the joint distribution of $x_1, x_2, \ldots, x_N$. Then

$$\bar{G}_{x_i | x_{i-1}}(t) = P[x_i - x_{i-1} > t] \tag{13}$$

$$= e^{-[a(x_{i-1} + t) - a(x_{i-1})]} \tag{14}$$

$$= e^{-[\lambda(0)(x_{i-1} + t) + (r/2)(x_{i-1} + t)^2] + [\lambda(0) x_{i-1} + (r/2) x_{i-1}^2]}. \tag{15}$$

Thus, the conditional density function $g_{x_i | x_{i-1}}(t)$ is given by

$$g_{x_i | x_{i-1}}(t) = [\lambda(0) + r(x_{i-1} + t)]\, e^{-[a(x_{i-1} + t) - a(x_{i-1})]} \tag{16}$$

$$= [\lambda(0) + r(t + x_{i-1})]\, e^{-t[\lambda(0) + r(x_{i-1} + x_i)/2]}. \tag{17}$$

This last presentation of $g_{x_i | x_{i-1}}(t)$ highlights the nature of the density function, with leading constant given as $\lambda(0) + r x_i$, the rate at the time of the generation epoch, and the exponent given as $-t$ times $\lambda(0) + r(x_{i-1} + x_i)/2$.
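The simplification of the exponent in the last form of the density rests on expanding the difference of the mean function. A worked version, using only $a(t) = \lambda(0)t + (r/2)t^2$ and writing $x_i = x_{i-1} + t$:

```latex
\begin{aligned}
a(x_{i-1}+t) - a(x_{i-1})
  &= \lambda(0)\,t + \frac{r}{2}\big[(x_{i-1}+t)^2 - x_{i-1}^2\big] \\
  &= \lambda(0)\,t + \frac{r}{2}\,t\,(2x_{i-1} + t) \\
  &= t\Big[\lambda(0) + \frac{r\,(x_{i-1} + x_i)}{2}\Big].
\end{aligned}
```

In words, the expected number of arrivals in the gap is the gap length times the intensity at the midpoint of the gap.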

When appropriately calibrated, the ramping workload process will drive the DESS into regimes in which it is underutilized, progressing to the point of total utilization, and then becoming saturated.




3.2  Detecting  the  Transition  to  Overloaded  for  a  DESS 

Let our system performance measure for input intensity $\lambda$ and time $t$ have expected value $v(\lambda, t)$. Our only assumptions about $v$ are that it is responsive to changes in $\lambda$, it grows at a rate similar to a degree $s$ polynomial when $\lambda < \mu$, and it grows faster when $\lambda > \mu$.

Hence, by estimating the $(s+1)$th derivative of the performance measure ($s + 1$ because we would like to deal with a mean-zero sequence), we can produce a sequence with

• constant mean when $\lambda < \mu$,

• some drift when $\lambda > \mu$.

Let $t_1, t_2, \ldots, t_n$ be $n$ evenly spaced points in time, where $\lambda(t_1)$ is believed to be less than the service capacity $\mu$ of the DESS, and $\lambda(t_n)$ is believed to be much greater than $\mu$. Our method for estimating DESS capacity will

1. replicate the system performance $M$ times using the workload ramp as the input process, collecting performance data for the $i$th replication;

2. form the $(s+1)$th derivative data $\Psi_i^{(s+1)}(\lambda(t_j), t_j)$, $j = s+1, s+2, \ldots, n$, $i = 1, 2, \ldots, M$ using sequential differences;

3. perform a transition detection to determine the point $N^*$ where $\Psi_i^{(s+1)}(\lambda(t_j), t_j)$ no longer have constant mean 0.
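Step 2 above, forming derivative data by sequential differences, can be sketched as follows. The function name `sequential_differences` and the toy cost series are hypothetical; the series is a degree $s = 2$ polynomial sampled at evenly spaced times, so differencing it $s + 1 = 3$ times should leave a constant (zero) sequence.

```python
def sequential_differences(values, order):
    """Apply first differences `order` times; the result is `order` points shorter."""
    result = list(values)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


cost = [3 * t * t + 2 * t + 1 for t in range(8)]   # degree s = 2 growth
print(sequential_differences(cost, 3))             # order s + 1
```

With noisy simulation output each differencing pass also amplifies the variability of the series, which is consistent with the growing variance noted for the differenced data in this section.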

We  will  propose  and  evaluate  three  methods  for  detecting  the  transition  in  the  performance 
measure: 

• modified $\bar{X}$-charts;

•  RSTS; 

•  adaptive  regression  splines. 

The application of each of these three methods will be made more difficult by a common feature of data we have collected: although the mean performance becomes constant after several differencing operations, the variance still grows. Hence, our model of the data from the system when unsaturated will be $Y_{i,j}$, $i = 1, 2, \ldots, M$, $j = 1, 2, \ldots, n$, which are mutually independent, and where $EY = 0$ is constant and $\sigma_Y$ is unknown and assumed to vary.

WORKSTATION   P[DESTRUCTION]   P[CREATION]   INSERTION STATION
left          0.2              0.0           -
top           0.1              0.4           2
bottom        0.2              0.1           4
right         -                0.3           2

Table 1: Destruction and Creation of Tasks within the DESS. Tasks that are destroyed are not considered completed.

Example, Continued

We modified the Jackson network described in section 2 so that it was no longer work-conserving, and so that it exhibited properties similar to the communications system described above. Table 1 shows what can happen to tasks in the system when they complete service at each of the nodes.

For this system, we still have a maximum input rate of $\lambda = 2$, as workstation 1 is not interfered with, and still has two servers serving at unit speed. The tasks for the system are of three classes. All have identical processing speeds at the workstations, but each class pays a different price for waiting in each workstation buffer. Finally, each task can stay in the DESS cost-free for some period of time. After this delay, the cost per unit time is accumulated based on the buffer the task resides in. Each task becomes costless once it leaves the system. All of the data on task classes is in table 2.

The system's measure of performance is the cost accumulated from the beginning of the simulation. We subjected this system to a ramped workload process which started with insertion rate $\lambda(0) = 1.0$ and climbed at a rate $r = 0.00666$. Thus, the system capacity is reached at time $150 = N^*$. We continued each experiment to time 300.

CLASS   FREE TIME   COST(LEFT)   COST(TOP)   COST(BOTTOM)   COST(RIGHT)
1       2.0         4            1           0              1
2       0.0         1            2           3              4
3       1.0         3            2           1              0

Table 2: Free Time and Holding Costs for Task Classes.

As we see from figure 6, we have the expected properties of constant mean and growing variance for the second derivative of the accumulated costs when $t < 150$, and something else occurring when $t > 150$.

3.2.1  Control  Charting 

The most straightforward methodology comes from the field of quality assurance [5], and involves the use of a sequential hypothesis test. We wish to know the first time that we can conclude that it is no longer plausible that the response mean is $EY = 0$. Our growing variance causes us to take one of two actions:

• use the cross-replication sample standard deviation to form a confidence interval;

• model the growth of the standard deviation and use the model when setting control limits.

Using the second derivative data from our example system, we can see that the former method is inconclusive because of constant false alarms (the lower control limit makes several visits above the x-axis during the trajectory), while using the modeled standard deviation creates an envelope which the mean stays inside even when we know the system is exhibiting drift, see figure 8.

3.2.2 RSTS for Detection of Saturation


Let us return to the construction of the sequential RSTS methods. In this case, we have two alterations to make:

1. the process must be reversed to detect transitions out of steady state;

2. the growing variability of the data violates the assumptions under which $T_i(t)$, $0 \le t \le 1$, was shown to be a Brownian bridge, the underpinning of the method.

Form the sequences $A_{i,j}$, $j = 1, 2, \ldots, n$ for each replication $i$, where $A_{i,j}$ is formed from the subsequence $Y_{i,k}$, $k = 1, 2, \ldots, j$. Thus, we are moving through the output sequence in the forward direction (opposite the usual STS for initialization bias).

Figure 6: Trajectories of the Second Derivative of the Accumulated Cost Samples. The center line is the mean of $M = 20$ independent replications, while the surrounding lines are the upper and lower normal confidence intervals.

The growing variability of the output must be attacked directly. The model which fits the output with $\lambda < \mu$ is a mean-zero model with linearly increasing standard deviation. The expected number of input tasks, $a(t)$, is quadratically increasing and is intimately linked to the growth of the performance measure and its variability. If we collect samples of the cost function at times chosen such that there is a constant expected number of input tasks per sample, we can avoid the problem of growing variability. Our data will still be formed by taking sequential differences, but the intervals of sampling will contract. Figure 9 shows empirically that sampling this way produces data with constant standard deviation in our example. Proving that this technique works in all cases is impossible because of the breadth of our generality here.

Let us establish the time interval sequence $t_1, t_2, \ldots, t_n$ such that we expect to inject exactly $C$ tasks into the system between $t_i$ and $t_{i+1}$, $i = 1, 2, \ldots, n-1$. Hence, given $t_i$ we can compute $t_{i+1} = t_i + \Delta t$ using

$$a(t_i + \Delta t) - a(t_i) = C. \tag{18}$$

By sampling at these times for a fixed value $C$, we produce a sample with constant standard deviation. Taking the sequential differences as above, we produce a sequence which has the properties required to perform RSTS.
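One way to compute these sampling times is sketched below. With $a(t) = \lambda(0)t + (r/2)t^2$, condition (18) becomes the quadratic $(r/2)\Delta t^2 + \lambda(t_i)\Delta t - C = 0$, whose positive root is $\Delta t = [-\lambda(t_i) + \sqrt{\lambda(t_i)^2 + 2rC}]/r$. The function name `equal_count_times`, the choice $C = 10$, and the restriction to $r > 0$ are our assumptions for illustration.

```python
import math


def equal_count_times(lam0, r, start, end, c):
    """Sampling times from `start` with c expected arrivals between neighbours."""
    times = [start]
    while True:
        lam_t = lam0 + r * times[-1]                    # intensity at current time
        # Positive root of (r/2)*dt^2 + lam_t*dt - c = 0, from equation (18).
        dt = (-lam_t + math.sqrt(lam_t * lam_t + 2.0 * r * c)) / r
        if times[-1] + dt > end:
            return times
        times.append(times[-1] + dt)


grid = equal_count_times(lam0=1.0, r=1.0 / 150.0, start=0.0, end=300.0, c=10.0)
print(len(grid), "sampling times; first gap", grid[1] - grid[0], "last gap", grid[-1] - grid[-2])
```

The gaps shrink as the intensity climbs, which is the contraction of the sampling intervals described above.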
Figure 10: Trajectory for RSTS for the modified sampling points data collection. Result: clear transition at time 123.0.

In our example system, we used the modified sampling points to produce the F-statistic trajectory shown in figure 10. The RSTS method gave a clear signal that the system lost steady-state at time 123.0, where $\lambda(t_i) = 1.82$ and $\rho = 0.910$. We also tried RSTS with evenly spaced points, and achieved approximately the same result. From this experience, we conclude that straightforward RSTS is fairly robust with respect to fluctuations or growth in variability when the changes are coordinated as they are in this experiment. Analytical investigation has shown that these fluctuations do not cancel in the RSTS test statistic.
3.2.3 Adaptive Regression Splines

Adaptive regression splines, especially multivariate regression splines, are the focus of intense basic research, see [8]. In our application, the regression spline required is especially simple, as the model is of a single independent variable, and we seek a single knot in the regression spline at the point where the zero-mean model departs from the data. Using the methods in Larson [4], we can derive the location of this single knot analytically.

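The model used is stated as

$$Y(t) = \begin{cases} \beta_0 + \beta_1 (t - N^*) + \epsilon_t, & t \le N^*, \\ \beta_0 + \beta_2 (t - N^*) + \epsilon_t, & t > N^*, \end{cases} \tag{21}$$

where $N^*$ is still our point where the regression model changes. If we further prescribe that

$$\beta_0 = 0, \tag{22}$$

$$\beta_1 = 0, \tag{23}$$

we will be fitting the model with zero mean to the left of the single knot. Let

$$SSE_L = \sum_{t \le N^*} Y^2(t), \tag{24}$$

$$SSE_R = \sum_{t > N^*} \left( Y(t) - \beta_2 (t - N^*) \right)^2, \tag{25}$$

$$SSE = SSE_L + SSE_R. \tag{26}$$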


The method involves the optimal location of $N^*$ to minimize SSE. The procedure provided in [4] can be simplified for our application into a single-pass examination of the means of the data, but is valid only when a linear model is appropriate for the data to the right of $N^*$. This method is valid when there is growing variability, as seen in our application.
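To make the knot search concrete, the following Python sketch fits the restricted model (zero mean to the left of the knot, a line through zero at the knot to the right) by brute force: it tries each observation time as a candidate knot, estimates the right-hand slope by least squares, and keeps the candidate with the smallest total SSE. This exhaustive search is a stand-in for illustration only and is not the simplified single-pass procedure derived from [4]. The function name `fit_single_knot` and the parameter `min_right` are hypothetical.

```python
def fit_single_knot(times, values, min_right=2):
    """Grid-search a single knot for a model that is zero to the left of
    the knot and linear (through zero at the knot) to the right.

    Returns (knot, slope, sse) for the candidate with minimum total SSE.
    Assumes `times` is strictly increasing and the same length as `values`.
    """
    n = len(times)
    best = None  # holds (sse, knot, slope) of the best candidate so far
    # Leave at least `min_right` points to the right of every candidate.
    for k in range(n - min_right):
        knot = times[k]
        # Left of (and at) the knot the fitted mean is zero.
        sse_left = sum(v * v for v in values[: k + 1])
        # Right of the knot: least-squares slope for y = slope * (t - knot).
        xs = [t - knot for t in times[k + 1:]]
        ys = values[k + 1:]
        sxx = sum(x * x for x in xs)
        slope = sum(x * y for x, y in zip(xs, ys)) / sxx
        sse_right = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
        sse = sse_left + sse_right
        if best is None or sse < best[0]:
            best = (sse, knot, slope)
    return best[1], best[2], best[0]
```

The search costs quadratic time in the number of samples, which is acceptable for a sketch; the single-pass simplification mentioned above avoids this recomputation.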

In our example, we fit the model in (21) with three parameters, then restricted it to the mean-zero model prescribed in (22-23). Figure 11 shows the two regression models. The three parameter model located a knot at time $N^* = 136.23$, when the input rate is 1.91 and the traffic intensity is 0.95. With the one parameter restriction, the adaptive spline located the optimal knot at 128.0. We


In this work, we have described the various problems which arise when we are attempting to determine the service capacity of a black box service system. We divided this investigation into two distinct parts, one dealing with simple queueing systems which conserve work. In this case, we showed the advantages of closing the system so that output from the system was recirculated, as the time between recirculations converges to the system's service rate. We investigated ways to detect this convergence, and showed how difficult this is in a simple Jackson network example.

The second part of the exploration dealt with systems which

• do not conserve work;

• have their performance measured using a general holding cost mechanism or some other performance measure.

In this case, we showed that if we modulated the input process using a ramped-intensity workload process, we could drive the system from underutilized to saturated. We explored three methods which unveil the point where this transition takes place, and demonstrated each on an example.

The wider significance of this work is the beginning of an exploration for empirical methods for determining the capacity of a service system. This exploration is done not by representing the system using a queueing model which we know how to analyze a priori, but by using a realistic model of the system and measuring its performance in terms the user has in mind.